Results 1 - 20 of 34
1.
Nat Comput Sci ; 2024 May 10.
Article in English | MEDLINE | ID: mdl-38730185

ABSTRACT

Single-cell epigenomic data have been growing continuously at an unprecedented pace, but their characteristics, such as high dimensionality and sparsity, pose substantial challenges to downstream analysis. Although deep learning models, especially variational autoencoders, have been widely used to capture low-dimensional feature embeddings, the prevalent Gaussian assumption somewhat disagrees with real data, and these models tend to struggle to incorporate reference information from abundant cell atlases. Here we propose CASTLE, a deep generative model based on the vector-quantized variational autoencoder framework to extract discrete latent embeddings that interpretably characterize single-cell chromatin accessibility sequencing data. We validate the performance and robustness of CASTLE for accurate cell-type identification and reasonable visualization compared with state-of-the-art methods. We demonstrate the advantages of CASTLE for effective incorporation of existing massive reference datasets in a weakly supervised or supervised manner. We further demonstrate CASTLE's capacity for intuitively distilling cell-type-specific feature spectra that unveil cell heterogeneity and biological implications quantitatively.
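
Editorial illustration: the abstract above refers to discrete latent embeddings from a vector-quantized variational autoencoder. The sketch below shows only the generic quantization step (nearest-codebook lookup) in NumPy. It is not CASTLE's implementation; the function name `quantize`, the toy codebook, and the toy embeddings are all assumptions made for illustration.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each continuous latent vector to its nearest codebook entry."""
    # Squared Euclidean distance between every latent (n, d) and every code (k, d).
    diffs = latents[:, None, :] - codebook[None, :, :]
    distances = np.sum(diffs ** 2, axis=-1)   # shape (n, k)
    indices = np.argmin(distances, axis=1)    # one discrete code index per row
    return indices, codebook[indices]

# Toy data (hypothetical): three 2-D cell embeddings and a codebook with two entries.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [0.4, 0.3]])
indices, quantized = quantize(latents, codebook)
```

In a full model the codebook would be learned jointly with the encoder and decoder; here it is fixed so that the lookup itself is easy to follow.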

2.
Bioinformatics ; 40(5), 2024 May 02.
Article in English | MEDLINE | ID: mdl-38625746

ABSTRACT

MOTIVATION: With the rapid advancement of single-cell sequencing technology, it is gradually becoming possible to delve into the cellular responses to various external perturbations at the gene expression level. However, obtaining perturbed samples in certain scenarios may be considerably challenging, and the substantial costs associated with sequencing also curtail the feasibility of large-scale experimentation. A repertoire of methodologies has been employed for forecasting perturbative responses in single-cell gene expression. However, existing methods primarily focus on the average response of a specific cell type to perturbation, overlooking the single-cell specificity of perturbation responses and a more comprehensive prediction of the entire perturbation response distribution. RESULTS: Here, we present scPRAM, a method for predicting perturbation responses in single-cell gene expression based on attention mechanisms. Leveraging variational autoencoders and optimal transport, scPRAM aligns cell states before and after perturbation, followed by accurate prediction of gene expression responses to perturbations for unseen cell types through attention mechanisms. Experiments on multiple real perturbation datasets involving drug treatments and bacterial infections demonstrate that scPRAM attains heightened accuracy in perturbation prediction across cell types, species, and individuals, surpassing existing methodologies. Furthermore, scPRAM demonstrates outstanding capability in identifying differentially expressed genes under perturbation, capturing heterogeneity in perturbation responses across species, and maintaining stability in the presence of data noise and sample size variations. AVAILABILITY AND IMPLEMENTATION: https://github.com/jiang-q19/scPRAM and https://doi.org/10.5281/zenodo.10935038.


Subjects
Single-Cell Analysis , Single-Cell Analysis/methods , Humans , Gene Expression Profiling/methods , Computational Biology/methods , Algorithms , Gene Expression
3.
Bioinform Adv ; 4(1): vbae055, 2024.
Article in English | MEDLINE | ID: mdl-38645715

ABSTRACT

Summary: Chromatin accessibility serves as a critical measurement of physical contact between nuclear macromolecules and DNA sequence, providing valuable insights into the comprehensive landscape of regulatory mechanisms; we therefore previously developed the OpenAnnotate web server. However, as an increasing number of epigenomic analysis software tools emerged, web-based annotation often faced limitations and inconveniences when integrated into these software pipelines. To address these issues, we here develop two software packages named OpenAnnotatePy and OpenAnnotateR. In addition to web-based functionalities, these packages encompass supplementary features, including the capability for simultaneous annotation across multiple cell types, advanced searching of systems, tissues and cell types, and converting the result to the data structure of mainstream tools. Moreover, we applied the packages to various scenarios, including revealing cell types, regulatory element prediction, and integration into mainstream single-cell ATAC-seq analysis pipelines including EpiScanpy, Signac, and ArchR. We anticipate that OpenAnnotateApi will significantly facilitate the deciphering of gene regulatory mechanisms, and offer crucial assistance in the field of epigenomic studies. Availability and implementation: OpenAnnotateApi is available for R at https://github.com/ZjGaothu/OpenAnnotateR and for Python at https://github.com/ZjGaothu/OpenAnnotatePy.

4.
Bioinformatics ; 40(4), 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38588573

ABSTRACT

SUMMARY: Recent technical advancements in single-cell chromatin accessibility sequencing (scCAS) have brought new insights to the characterization of epigenetic heterogeneity. As single-cell genomics experiments scale up to hundreds of thousands of cells, the demand for computational resources for downstream analysis grows intractably large and exceeds the capabilities of most researchers. Here, we propose EpiCarousel, a tailored Python package based on lazy loading, parallel processing, and community detection for memory- and time-efficient identification of metacells, i.e. the emergence of homogenous cells, in large-scale scCAS data. Through comprehensive experiments on five datasets of various protocols, sample sizes, dimensions, number of cell types, and degrees of cell-type imbalance, EpiCarousel outperformed baseline methods in systematic evaluation of memory usage, computational time, and multiple downstream analyses including cell type identification. Moreover, EpiCarousel executes preprocessing and downstream cell clustering on the atlas-level dataset with 707 043 cells and 1 154 611 peaks within 2 h consuming <75 GB of RAM and provides superior performance for characterizing cell heterogeneity than state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: The EpiCarousel software is well-documented and freely available at https://github.com/biox-nku/epicarousel. It can be seamlessly interoperated with extensive scCAS analysis toolkits.


Subjects
Chromatin , Single-Cell Analysis , Software , Chromatin/metabolism , Single-Cell Analysis/methods , Humans , Genomics/methods , Computational Biology/methods
5.
Nat Commun ; 15(1): 2973, 2024 Apr 06.
Article in English | MEDLINE | ID: mdl-38582890

ABSTRACT

Recent advancements in simultaneously profiling multi-omics modalities within individual cells have enabled the interrogation of cellular heterogeneity and molecular hierarchy. However, technical limitations lead to highly noisy multi-modal data and substantial costs. Although computational methods have been proposed to translate single-cell data across modalities, broad applications of the methods remain impeded by formidable challenges. Here, we propose scButterfly, a versatile single-cell cross-modality translation method based on dual-aligned variational autoencoders and data augmentation schemes. With comprehensive experiments on multiple datasets, we provide compelling evidence of scButterfly's superiority over baseline methods in preserving cellular heterogeneity while translating datasets of various contexts and in revealing cell type-specific biological insights. Besides, we demonstrate the extensive applications of scButterfly for integrative multi-omics analysis of single-modality data, data enhancement of poor-quality single-cell multi-omics, and automatic cell type annotation of scATAC-seq data. Moreover, scButterfly can be generalized to unpaired data training, perturbation-response analysis, and consecutive translation.

6.
Article in English | MEDLINE | ID: mdl-38442065

ABSTRACT

Rapid advances in single-cell chromatin accessibility sequencing (scCAS) technologies have enabled the characterization of epigenomic heterogeneity and increased the demand for automatic annotation of cell types. However, there are few computational methods tailored for cell type annotation in scCAS data, and the existing methods perform poorly for differentiating and imbalanced cell types. Here, we propose CASCADE, a novel annotation method based on simulation- and denoising-based strategies. With comprehensive experiments on a number of scCAS datasets, we showed that CASCADE can effectively distinguish the patterns of different cell types and mitigate the effect of high noise levels, and thus achieve significantly better annotation performance for differentiating and imbalanced cell types. Besides, we performed model ablation experiments to show the contribution of modules in CASCADE and conducted extensive experiments to demonstrate the robustness of CASCADE to batch effect, imbalance degree, data sparsity, and number of cell types. Moreover, CASCADE significantly outperformed baseline methods for accurately annotating the cell types in newly sequenced data. We anticipate that CASCADE will greatly assist with characterizing cell heterogeneity in scCAS data analysis. The source code and datasets are available at https://github.com/BioX-NKU/CASCADE/.

8.
Nat Commun ; 15(1): 1629, 2024 Feb 22.
Article in English | MEDLINE | ID: mdl-38388573

ABSTRACT

Single-cell chromatin accessibility sequencing (scCAS) has emerged as a valuable tool for interrogating and elucidating epigenomic heterogeneity and gene regulation. However, scCAS data inherently suffers from limitations such as high sparsity and dimensionality, which pose significant challenges for downstream analyses. Although several methods are proposed to enhance scCAS data, there are still challenges and limitations that hinder the effectiveness of these methods. Here, we propose scCASE, a scCAS data enhancement method based on non-negative matrix factorization which incorporates an iteratively updating cell-to-cell similarity matrix. Through comprehensive experiments on multiple datasets, we demonstrate the advantages of scCASE over existing methods for scCAS data enhancement. The interpretable cell type-specific peaks identified by scCASE can provide valuable biological insights into cell subpopulations. Moreover, to leverage the large compendia of available omics data as a reference, we further expand scCASE to scCASER, which enables the incorporation of external reference data to improve enhancement performance.
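
Editorial illustration: the abstract above builds on non-negative matrix factorization (NMF). The sketch below shows plain NMF with the classic multiplicative updates for the Frobenius objective, applied to a toy binary cells-by-peaks matrix. It deliberately omits the iteratively updated cell-to-cell similarity matrix described in the abstract and is not the scCASE implementation; the function name `nmf`, its parameters, and the toy matrix are assumptions made for illustration.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Factorize a non-negative matrix X (n x m) as W (n x rank) @ H (rank x m)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank))
    H = rng.random((rank, m))
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy binary accessibility matrix (hypothetical): 3 cells x 4 peaks, two clear groups.
X = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)
W, H = nmf(X, rank=2)
error = np.linalg.norm(X - W @ H)  # Frobenius reconstruction error
```

The rows of `W` can be read as per-cell loadings and the rows of `H` as per-peak patterns; the product `W @ H` is a dense, smoothed approximation of the sparse input.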


Subjects
Algorithms , Chromatin , Chromatin/genetics , Epigenomics/methods , Gene Expression Regulation , Single-Cell Analysis
10.
Brief Bioinform ; 25(1), 2023 11 22.
Article in English | MEDLINE | ID: mdl-38113078

ABSTRACT

Single-cell chromatin accessibility sequencing (scCAS) technologies have enabled the characterization of the epigenomic heterogeneity of individual cells. However, the identification of features of scCAS data that are relevant to underlying biological processes remains a significant gap. Here, we introduce a novel method, Cofea, to fill this gap. Through comprehensive experiments on 5 simulated and 54 real datasets, Cofea demonstrates its superiority in capturing cellular heterogeneity and facilitating downstream analysis. Applying this method to the identification of cell type-specific peaks and candidate enhancers, as well as pathway enrichment analysis and partitioned heritability analysis, we illustrate the potential of Cofea to uncover functional biological processes.


Subjects
Chromatin , Regulatory Sequences, Nucleic Acid , Chromatin/genetics
11.
Genome Res ; 33(10): 1757-1773, 2023 10.
Article in English | MEDLINE | ID: mdl-37903634

ABSTRACT

Rapid advances in spatial transcriptomics (ST) have revolutionized the interrogation of spatial heterogeneity and increased the demand for comprehensive methods to effectively characterize spatial domains. As a prerequisite for ST data analysis, spatial domain characterization is a crucial step for downstream analyses and biological implications. Here we propose a prior-based self-attention framework for spatial transcriptomics (PAST), a variational graph convolutional autoencoder for ST, which effectively integrates prior information via a Bayesian neural network, captures spatial patterns via a self-attention mechanism, and enables scalable application via a ripple walk sampler strategy. Through comprehensive experiments on data sets generated by different technologies, we show that PAST can effectively characterize spatial domains and facilitate various downstream analyses, including ST visualization, spatial trajectory inference and pseudotime analysis. Also, we highlight the advantages of PAST for multislice joint embedding and automatic annotation of spatial domains in newly sequenced ST data. Compared with existing methods, PAST is the first ST method that integrates reference data to analyze ST data. We anticipate that PAST will open up new avenues for researchers to decipher ST data with customized reference data, which expands the applicability of ST technology.


Subjects
Gene Expression Profiling , Transcriptome , Bayes Theorem , Neural Networks, Computer , Spatial Analysis
12.
Genome Biol ; 24(1): 225, 2023 10 09.
Article in English | MEDLINE | ID: mdl-37814314

ABSTRACT

Application of the widely used droplet-based microfluidic technologies in single-cell sequencing often yields doublets, introducing bias to downstream analyses. In particular, doublet-detection methods for single-cell chromatin accessibility sequencing (scCAS) data face multiple assay-specific challenges. Therefore, we propose scIBD, a self-supervised iterative-optimizing model for boosting heterotypic doublet detection in scCAS data. scIBD introduces an adaptive strategy to simulate high-confidence heterotypic doublets and self-supervise for doublet-detection in an iteratively optimizing manner. Comprehensive benchmarking on various simulated and real datasets demonstrates the superior performance and robustness of scIBD. Moreover, the downstream biological analyses suggest the efficacy of doublet-removal by scIBD.


Subjects
Chromatin , Single-Cell Analysis , Single-Cell Analysis/methods
13.
Bioinformatics ; 39(8), 2023 08 01.
Article in English | MEDLINE | ID: mdl-37494428

ABSTRACT

MOTIVATION: Single-cell chromatin accessibility sequencing (scCAS) technology provides an epigenomic perspective to characterize gene regulatory mechanisms at single-cell resolution. With an increasing number of computational methods proposed for analyzing scCAS data, a powerful simulation framework is desirable for evaluation and validation of these methods. However, existing simulators generate synthetic data by sampling reads from real data or mimicking existing cell states, which is inadequate to provide credible ground-truth labels for method evaluation. RESULTS: We present simCAS, an embedding-based simulator, for generating high-fidelity scCAS data from both cell- and peak-wise embeddings. We demonstrate that simCAS outperforms existing simulators in resembling real data and show that simCAS can generate cells of different states with user-defined cell populations and differentiation trajectories. Additionally, simCAS can simulate data from different batches and encode user-specified interactions of chromatin regions in the synthetic data, which provides ground-truth labels beyond cell states. We systematically demonstrate that simCAS facilitates the benchmarking of four core tasks in downstream analysis: cell clustering, trajectory inference, data integration, and cis-regulatory interaction inference. We anticipate simCAS will be a reliable and flexible simulator for evaluating the ongoing computational methods applied to scCAS data. AVAILABILITY AND IMPLEMENTATION: simCAS is freely available at https://github.com/Chen-Li-17/simCAS.


Subjects
Chromatin , Gene Expression Regulation , Computer Simulation , Sequence Analysis, DNA/methods , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis/methods
14.
Cells ; 12(4), 2023 02 13.
Article in English | MEDLINE | ID: mdl-36831270

ABSTRACT

Recent advances in spatial transcriptomics have revolutionized the understanding of tissue organization. The identification of spatially variable genes (SVGs) is an essential step for downstream spatial domain characterization. Although several methods have been proposed for identifying SVGs, inadequate ability to decipher spatial domains, poor efficiency, and insufficient interoperability with existing standard analysis workflows still impede the applications of these methods. Here we propose SINFONIA, a scalable method for identifying spatially variable genes via ensemble strategies. Implemented in Python, SINFONIA can be seamlessly integrated into existing analysis workflows. Using 15 spatial transcriptomic datasets generated with different protocols and with different sizes, dimensions and qualities, we show the advantage of SINFONIA over three baseline methods and two variants via systematic evaluation of spatial clustering, domain resolution, latent representation, spatial visualization, and computational efficiency with 21 quantitative metrics. Additionally, SINFONIA is robust relative to the choice of the number of SVGs. We anticipate SINFONIA will facilitate the analysis of spatial transcriptomics.


Subjects
Gene Expression Profiling , Transcriptome
15.
Bioinformatics ; 39(1), 2023 01 01.
Article in English | MEDLINE | ID: mdl-36610708

ABSTRACT

SUMMARY: Recent innovations in single-cell chromatin accessibility sequencing (scCAS) have revolutionized the characterization of epigenomic heterogeneity. Estimation of the number of cell types is a crucial step for downstream analyses and biological implications. However, efforts to perform estimation specifically for scCAS data are limited. Here, we propose ASTER, an ensemble learning-based tool for accurately estimating the number of cell types in scCAS data. ASTER outperformed baseline methods in systematic evaluation on 27 datasets of various protocols, sizes, numbers of cell types, degrees of cell-type imbalance, cell states and qualities, providing valuable guidance for scCAS data analysis. AVAILABILITY AND IMPLEMENTATION: ASTER along with detailed documentation is freely accessible at https://aster.readthedocs.io/ under the MIT License. It can be seamlessly integrated into existing scCAS analysis workflows. The source code is available at https://github.com/biox-nku/aster. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromatin , Software , Epigenomics , Documentation , Workflow
16.
Brief Bioinform ; 24(1), 2023 01 19.
Article in English | MEDLINE | ID: mdl-36513377

ABSTRACT

Single-cell analysis is a valuable approach for dissecting cellular heterogeneity, and single-cell chromatin accessibility sequencing (scCAS) can profile the epigenetic landscapes for thousands of individual cells. It is challenging to analyze scCAS data because of its high dimensionality and a higher degree of sparsity compared with scRNA-seq data. Topic modeling in single-cell data analysis can lead to robust identification of the cell types and it can provide insight into the regulatory mechanisms. A reference-guided approach may facilitate the analysis of scCAS data by utilizing the information in existing datasets. We present RefTM (Reference-guided Topic Modeling of single-cell chromatin accessibility data), which not only utilizes the information in existing bulk chromatin accessibility and annotated scCAS data, but also takes advantage of topic models for single-cell data analysis. RefTM simultaneously models: (1) the shared biological variation among reference data and the target scCAS data; (2) the unique biological variation in scCAS data; (3) other variations from known covariates in scCAS data.


Subjects
Chromatin , Chromatin/genetics
17.
iScience ; 25(8): 104790, 2022 Aug 19.
Article in English | MEDLINE | ID: mdl-35992073

ABSTRACT

Complex traits such as cardiovascular diseases (CVD) are the results of complicated processes jointly affected by genetic and environmental factors. Genome-wide association studies (GWAS) identified genetic variants associated with diseases but usually did not reveal the underlying mechanisms. There could be many intermediate steps at epigenetic, transcriptomic, and cellular scales inside the black box of genotype-phenotype associations. In this article, we present a machine-learning-based cross-scale framework, GRPath, to decipher putative causal paths (pcPaths) from genetic variants to disease phenotypes by integrating multiple omics data. Applying GRPath to CVD, we identified 646 and 549 pcPaths linking putative causal regions, variants, and gene expressions in specific cell types for two types of heart failure, respectively. The findings suggest new understandings of coronary heart disease. Our work promoted the modeling of tissue- and cell type-specific cross-scale regulation to uncover mechanisms behind disease-associated variants, and provided new findings on the molecular mechanisms of CVD.

18.
Front Pediatr ; 10: 846560, 2022.
Article in English | MEDLINE | ID: mdl-35874593

ABSTRACT

Background: Maternal stress during pregnancy can raise the risk of mental disorders in offspring. The continuous emergence of clinical concepts and the introduction of new technologies are great challenges. In this study, through bibliometric analysis, the research trends and hotspots on prenatal stress (PS) were explored to comprehend clinical treatments and recommend future scientific research directions. Methods: Studies on PS published on the Web of Science Core Collection (WoSCC) database between 2011 and 2021 were reviewed. Bibliometric analysis was conducted according to the number of publications, keywords, journals, citations, affiliations, and countries. With the data collected from the WoSCC, visualization of geographic distribution; clustering analysis of keywords, affiliations, and authors; and descriptive analysis and review of PS were carried out. Results: A total of 7,087 articles published in 2011-2021 were retrieved. During this period, the number of publications increased. Psychoneuroendocrinology is the leading journal on PS. The largest contributor was the United States. The University of California system was leading among institutions conducting relevant research. Wang H, King S, and Tain YL were scholars with significant contributions. Hotspots were classified into four clusters, namely, pregnancy, prenatal stress, oxidative stress, and growth. Conclusion: The number of studies on PS increased. Journals, countries, institutions, researchers with the most contributions, and most cited articles worldwide were extracted. Studies have mostly concentrated on treating diseases, the application of new technologies, and the analysis of epidemiological characteristics. Multidisciplinary integration is becoming the focus of current development. Epigenetics is increasingly used in studies on PS. Thus, it constitutes a solid foundation for future clinical medical and scientific research.

19.
J Psychiatr Res ; 151: 17-24, 2022 07.
Article in English | MEDLINE | ID: mdl-35427874

ABSTRACT

Numerous studies have shown that prenatal stress (PS) induces learning and memory deficits in offspring, yet knowledge of the specific mechanisms and effective interventions remains limited. Chewing has been known as one of the active coping strategies to suppress stress, but its effects during PS on learning and memory are unknown. The purpose of this study was to investigate the role of hippocampal AMPA receptors in the adverse effects of PS on spatial learning and memory, and whether chewing during PS could prevent these effects in prenatally stressed adult offspring rats. Dams were subjected to prenatal restraint stress, with or without chewing, during days 11-20 of pregnancy to analyze the impact of the different treatments on offspring. Spatial learning and memory were tested by the Morris water maze. The mRNA and protein expression of AMPA receptors in the hippocampus were measured by qRT-PCR and Western blot, respectively. The methylation of AMPA receptors was detected by bisulfite sequencing PCR. Our results revealed that PS impaired spatial learning acquisition and memory retrieval in adult offspring rats, but chewing could relieve this effect. Hippocampal GluA1-4 expression was significantly reduced in prenatally stressed offspring, while there were no changes in the methylation level of GluA2 and GluA4 promoters. Moreover, chewing reversed the PS-induced suppression of AMPA receptors in the hippocampus. In short, hippocampal AMPA receptors mediate the impairment of spatial learning and memory in prenatally stressed offspring, whereas chewing during PS could ameliorate PS-induced memory deficits.


Subjects
Prenatal Exposure Delayed Effects , Spatial Learning , Animals , Female , Hippocampus , Humans , Maze Learning , Memory Disorders/etiology , Memory Disorders/metabolism , Pregnancy , Rats , Receptors, AMPA/metabolism , Spatial Memory , Stress, Psychological/complications , Stress, Psychological/metabolism
20.
Mol Ther Nucleic Acids ; 26: 732-748, 2021 Dec 03.
Article in English | MEDLINE | ID: mdl-34703655

ABSTRACT

Because current mainstream anti-glycolipid GD2 therapeutics for neuroblastoma (NB) have limitations, such as severe adverse effects, improved therapeutics are needed. In this study, we developed a GD2 aptamer (DB99) and constructed a GD2-aptamer-mediated multifunctional nanomedicine (ANM) with effective, precise, and biocompatible properties, which functioned both as chemotherapy and as gene therapy for NB. DB99 can bind to GD2+ NB tumor cells but has minimal cross-reactivity to GD2- cells. Furthermore, ANM is formulated by self-assembly of synthetic aptamers DB99 and NB-specific MYCN small interfering RNA (siRNA), followed by self-loading of the chemotherapeutic agent doxorubicin (Dox). ANM is capable of specifically recognizing, binding, and internalizing GD2+, but not GD2-, NB tumor cells in vitro. Intracellular delivery of ANM activates Dox release for chemotherapy and MYCN-siRNA-induced MYCN silencing. ANM specifically targets, and selectively accumulates in, the GD2+ tumor site in vivo and further induces growth inhibition of GD2+ tumors in vivo; in addition, ANM generates fewer or no side effects in healthy tissues, resulting in markedly longer survival with fewer adverse effects. These results suggest that the GD2-aptamer-mediated, targeted drug delivery system may have potential applications for precise treatment of NB.
